     Evaluation of the Railroad Retirement Board’s Disaster Recovery Plan

                       Report No. 06-08, August 14, 2006


                                   INTRODUCTION


This report presents the results of the Office of Inspector General’s evaluation of
the Railroad Retirement Board’s (RRB) Disaster Recovery Plan (DRP).

BACKGROUND

The RRB’s mission is to administer retirement and survivor insurance benefit
programs for railroad workers and their families under the Railroad Retirement Act.
The RRB also administers unemployment and sickness insurance benefit
programs under the Railroad Unemployment Insurance Act. During Fiscal Year
2005, the RRB paid approximately $9 billion in age and service benefits to retired
workers and their families.

A DRP applies to major, usually catastrophic, events that deny physical or remote
access to the normal facility for an extended period. The DRP includes a
continuity of operations plan that focuses on restoring an organization’s essential
functions at an alternate site and performing those functions for a period of time
before returning to normal operations.

The Office of Administration oversees and coordinates overall disaster recovery
planning for the agency. The Bureau of Information Services (BIS) shares
responsibility for plans relating to information technology systems. Over the past
several years, the Information Resources Management component of BIS has
taken the lead on disaster planning, with involvement by other units and the
Executive Committee.

The RRB published a DRP on December 23, 2003.
This plan, prepared with the
assistance of Science Applications International Corporation (SAIC), outlines
provisions for:

• Recovery and continuity of critical business functions performed by
  agency bureaus and offices immediately following a major disruption or
  disaster, and

• Reconstitution of full normal operations when conditions permit return to
  original, or replacement, primary facilities.

This plan recognized that a disaster affecting RRB Headquarters (HQ) could
extensively impact operations, especially if the information technology (IT)
infrastructure were lost or significantly disrupted. The vast majority of the critical
business functions performed by the agency are dependent on applications
maintained on the RRB’s mainframe. Likewise, the RRB’s local area network is
essential for providing connectivity between Field Office networks, HQ user
workstations and the RRB’s mainframe computer.

The RRB’s DRP is an IT-focused plan designed to restore operability of the target
system, application, or computer facility at an alternate IT site after an emergency.
The RRB has established several key recovery objectives that include restoring
within 15 days the functions most critical to accomplishing the agency’s
contingency planning objectives.
Additional functions will be restored within 30
days to ensure backlogs do not become unmanageable.

OBJECTIVE, SCOPE AND METHODOLOGY

The objective of this review was to determine whether the agency’s DRP provided
reasonable assurance that the agency will be able to recover from a major
disruption or disaster, continue critical business functions, and reconstitute full
normal operations within established timeframes.

The scope of this review included the agency’s most recent DRP, and any agency
activity and testing that has been done.

To accomplish the objective, we:

• reviewed the DRP prepared by SAIC;
• reviewed outstanding recommendations made by SAIC;
• reviewed pertinent Federal laws, policies and background information as
  they related to the objective;
• conducted meetings with RRB officials to discuss agency policies and
  procedures;
• reviewed results of past emergency preparedness exercises for type,
  frequency, and thoroughness;
• assessed efforts to mitigate business disruption risks and to ensure
  adequate disaster preparation; and
• identified opportunities for improvement.

This review was conducted in accordance with generally accepted government
auditing standards applicable to the objective. The fieldwork was performed at the
RRB headquarters in Chicago, Illinois from October 2005 through June 2006.


                               RESULTS OF REVIEW


The agency’s DRP provides assurance that major information technology
functions would be operational in the event of a disaster. However, this assurance
depends on the RRB having access to a Chicago area offsite disaster recovery site.
The RRB is not guaranteed access to this site and should address this risk and
plan accordingly.
In addition, the RRB has not tested portions of the DRP related
to reconstitution of operations. Because of these vulnerabilities and other
concerns presented in this report, the RRB does not have reasonable assurance
that it will be able to recover from a major disaster and perform its critical business
functions in a timely manner.

RECOVERY SITE

In the event a disaster renders the RRB headquarters unusable, the DRP calls for
relocating critical business functions to an offsite recovery facility in the Chicago
area. When the DRP was developed, a concern was raised that the RRB’s contract
for this offsite facility did not provide the RRB with priority use of the site, and it
was questionable whether it would be available for RRB use under “September 11,
2001, terrorist attack” conditions. In response to this concern, the RRB replied
that the contractor reported a 100% success rate in over 1,500
disaster declarations, including many multiple disaster scenarios during the 9/11
terrorist attacks. The contractor claimed to have supported over 90 disaster
declarations as a result of the attacks.

Even with the contractor’s assurances, the RRB is at risk of not having access to
the Chicago area offsite recovery facility during a catastrophic event. The
RRB has access to the offsite facility on a first-come, first-served basis. If the
offsite facility is unavailable, the contract provides for an alternate facility located in
Philadelphia, PA. The RRB has not done any testing at this alternate facility. The
RRB’s DRP does not provide for contingencies in the event this alternate facility is
required.
These contingencies would include transportation and housing of
employees and the impact that traveling to this alternate site would have on
recovery timeliness.

Recommendation

We recommend that the Chief Information Officer (CIO) address the risk of denied
access to the Chicago area disaster recovery site, and identify actions the RRB
would need to take in the event this disaster recovery site is not available. The
CIO should also address the use of the Philadelphia facility, including verifying and
evaluating the adequacy of this secondary location and exploring housing issues
(Recommendation #1).

Management’s Response

Management concurs with the recommendation. Management will update the
Business Continuity Plan (BCP) with additional documentation regarding the
alternate SunGard back-up site describing the secondary Philadelphia facility and
potential vicinity housing. The target date for the revised BCP is December 1,
2006.

The full text of management’s response is included as an appendix to this report.


TESTING

The RRB’s current DRP was created in 2003 with the help of an outside company.
The RRB adequately tests the recovery phase of the DRP but needs to expand
testing of the other phases.

The RRB’s DRP identifies the following three phases of the contingency
planning/emergency response cycle:

   • Notification and Activation Phase - notifying key personnel of an incident,
     assessing conditions and, if warranted, activating contingency plans;

   • Recovery Phase - recovery of critical business functions generally at an
     alternate site location; and

   • Reconstitution Phase - documenting in detail the damage done to the
     primary facility, developing a plan for its repair or replacement,
     accomplishing the required restoration or replacement, and
     reconstituting full normal operations at the original or new permanent
     facility.

The plan calls for testing and exercises to verify the completeness and workability
of the plan, identify needed revisions to plan procedures, determine the adequacy
of training, and identify revisions to training policies and procedures. Without
periodic testing, there is no assurance that equipment and procedures are
maintained in a constant state of readiness.

RRB policy provides that testing can be done at any of three levels:

   • A Level 1 test checks the adequacy of a particular procedure or aspect of
     the plan, without actually performing the procedure. For example, a Level 1
     test of off-site storage procedures would concentrate on the availability of
     the files and documentation needed for recovery.

   • A Level 2 test checks the workability and adequacy of recovery and/or
     business resumption procedures by actually performing the procedure
     in-house. For example, a Level 2 test of off-site storage procedures would
     involve system restoration using in-house systems, recovery personnel, and
     off-site files and documentation.

   • A Level 3 test checks the workability and adequacy of recovery and/or
     business resumption procedures by actually performing the procedure at
     the backup site. It checks the adequacy of the backup facility and
     management's ability to control and direct the recovery process outside the
     normal setting. For example, a Level 3 test of off-site storage procedures
     would involve system restoration and processing of contingency
     applications at the backup site.

RRB policy provides that each task force and committee of the recovery
organization is required to test at least twice a year with one test at a Level 2 or
higher.
The scope of the testing can be focused on a specific aspect of the plan,
several related aspects, or all aspects. Bureau/Office business resumption plans
are to be tested at least twice a year, with the same level of testing and scope as
for recovery organization testing.

Due to limited resources and testing time constraints, the RRB does not test the
entire DRP, but primarily tests the Recovery Phase. The RRB contracts for two
tests a year at a Chicago area offsite facility. These tests are allotted 24 hours
each at the offsite facility. The RRB has prioritized recovery of the mainframe
system and connectivity of the Local Area Network at an offsite facility as the
areas to be tested on a semi-annual basis. Since 2002, the RRB has included
some user applications in the testing.

A typical test involves verifying that the Mainframe Operating System (and all its
components) and the Local Area Network execute properly on the offsite facility’s
system. The tests include successfully restoring all production databases. A user
group selects production batch jobs to test. Testing also involves setting up a
workstation/network environment. The user group has commented that a 24-hour
test does not give them enough time to thoroughly test applications, and has
suggested a two-day annual test to replace the semi-annual tests.

The RRB’s disaster testing ensures that major information technology functions
would be operational in the event of a disaster and that benefit payments would be
made to the current beneficiaries. However, this testing offers no guarantee that
other phases of the plan will be adequate to bring the RRB back to full operations.
In a
worst-case scenario, in which both the RRB headquarters and the Chicago area
offsite facilities are unusable, RRB benefit payments can be made by the
Department of the Treasury (Treasury) based on Treasury’s records of the RRB’s
previous benefit payments. However, new applications for benefits and any
changes to benefits made since the last benefit payment cycle would not be
correctly paid. The unprocessed work would cause payment delays and improper
payments, and would create a backlog that the RRB’s already strained resources
would have to accommodate.

Recommendation

We recommend that the Director of Administration, as Chairman of the Crisis
Management Committee, ensure that other phases of the DRP are tested
(Recommendation #2).

Management’s Response

Management concurs with the recommendation. Management will expand testing
to include all phases of the contingency plans to verify the completeness and
workability of the plan, to identify needed revisions to plan procedures, to
determine the adequacy of training, and to identify needed revisions to training
policies and procedures. The target date for completion of a Level 1 test is
March 30, 2007.

The full text of management’s response is included as an appendix to this report.

RECALL ROSTERS

The RRB’s DRP includes an appendix called the Emergency Management
Organization (EMO) Recall Rosters. The purpose of these rosters is to have data
for all personnel assigned to EMO positions in one place. This data includes
contact information and the roles of critical employees who would be involved in
disaster recovery. These rosters include 192 of the approximately 1,000 RRB
employees. The last DRP test involved 19 people. Three of these people were
not included on the recall rosters.
We discussed this situation with RRB
management, and they agreed that everyone involved in testing should be included
on the recall rosters.

The RRB sends an annual e-mail notice to each employee on the rosters asking
them to confirm their contact information. However, there is no control to ensure
that all critical employees are included on the Recall Rosters.

Recommendation

We recommend that the Director of Administration establish procedures to ensure
all critical employees are included on the Emergency Recall Rosters
(Recommendation #3).

Management’s Response

Management concurs with the recommendation. The Executive Committee
members have begun a review of the data in the Emergency Recall Roster to
assess the accuracy of the information, and the roster listing will be updated with
any changes identified as a result of the examination. During the next annual
updating cycle of the Emergency Recall Roster, procedures will call for a positive
confirmation response of team membership by team leaders. The target date for
completing the current review and update is August 15, 2006.

The full text of management’s response is included as an appendix to this report.

TRAINING

The DRP describes overall training objectives that cover a wide range of outcomes
from simple awareness of the major provisions of the plan to the ability to carry out
specific procedures.
These objectives require that trainees be able to:

   • describe the recovery organization (teams and functions),
   • explain the flow of recovery events and activities following a disaster,
   • state one's own responsibilities in recovery activities, and
   • perform assigned procedures.

The DRP calls for a training schedule with:

   • initial training immediately upon assignment to a team,
   • refresher training on an annual basis, and
   • remedial training when determined necessary following a test/exercise.

Discussions with three employees on the recall rosters disclosed that they had not
received any disaster training since the plan had been developed. Two individuals
were not aware of their roles as specified by the recall roster. One person
questioned whether the role specified for her was appropriate.

Recommendations

We recommend that the Director of Administration:

   • revise future annual recall roster email notices to include the employee’s
     membership on DRP teams and their role/duties and request that the
     employee review the information to ensure it is correct (Recommendation
     #4); and

   • ensure that the DRP training plan is followed (Recommendation #5).

Management’s Response

Management concurs with the recommendations. During the next updating cycle
of the Emergency Recall Roster, procedures will call for reviewing and verifying
the accuracy of DRP team membership along with the contact information. The
target date for completing the roster update is February 28, 2007.

Management will develop procedures for DRP team leaders to meet annually with
their respective team members to ensure that each participant understands their
roles and duties in the event that the plans need to be executed.
The target date for
completing this is March 30, 2007.

The full text of management’s response is included as an appendix to this report.

OTHER CONSIDERATIONS

The RRB’s current DRP was finalized in 2003. Although the plan is recent,
Hurricane Katrina, which struck New Orleans in 2005, showed how vulnerable
federal disaster plans could be. Issues that arose there should be addressed in
updates to the RRB’s DRP.

Directors of Federal Executive Boards across the country and managers from
seven Atlanta-area agencies met in March 2006 to discuss lessons learned from
Katrina. The following is a summary of lessons learned:

• Agencies should change their entire paradigm to be less focused on
  information technology and more on business recovery and deployment of
  people.
• Federal managers say the most unexpected problem following Hurricane
  Katrina was the collapse of the region’s telephone infrastructure. None of the
  three area codes in and around New Orleans worked. Not only could
  managers not contact employees, but employees could not contact their
  families, compounding the already stressful situation.
• Most agencies have call-in numbers for sharing emergency information, but
  those rely upon a functioning communications infrastructure.
  Experts
  recommend an out-of-town number for employees to call during emergencies.
  Making sure employees know where to call and where to get information and
  updated instructions is vital.
• Agencies need to maintain updated contact information on employees and
  develop alternate means of reaching workers before and after emergencies.
  Employees should provide agencies with the numbers of out-of-town relatives
  or other contacts.
• Another problem for managers after Katrina was the lack of a single database
  for tracking down where people had fled. Even if employees are not
  essential, agencies need to know where they are so they can be brought back
  to the federal work force if necessary.
• Agency emergency plans must also account for employees’ families. Family
  emergency plans should be a key part of every agency’s emergency plans,
  according to FEMA. The San Francisco Federal Executive Board hosted a
  training session on family support planning in March, and family
  considerations are being incorporated into its July disaster-response exercise.
• Experts say agencies should prepare contracts and agreements for housing
  ahead of time. Most agencies now typically make arrangements only for
  alternate worksites, back-up computers and IT services.
• After Katrina, many agencies learned the value of teleworking. As a result,
  disaster recovery planners are trying to incorporate teleworking into their plans
  and exercises.
• In their planned July exercise, San Francisco agencies are going to simulate a
  pandemic, such as an avian flu outbreak. In the past, emergency plans were
  based on getting key personnel to an alternate work site. But if a pandemic
  broke out, the priority would be preserving public health.
  Because employees
  would be working from home and tending to their families, agencies need to be
  more proactive about setting up telework arrangements.

Recommendation

We recommend that the Director of Administration address the above issues and
consider how the RRB can update the DRP and disaster testing to better prepare
for a disaster (Recommendation #6).

Management’s Response

Management concurs with the recommendation. Management will review the nine
lessons from the Hurricane Katrina experience and plan further modifications
taking into consideration how those recommendations may better prepare the
RRB for a disaster. Target date for the revised BCP is December 1, 2006.

The full text of management’s response is included as an appendix to this report.

Appendix